Prompt-tuning in Controlled Dialogue Generation
Dialogue response generation has developed rapidly since the advent of the Transformer. Fine-tuning pretrained language models for different downstream tasks has become the dominant paradigm in Natural Language Processing (NLP). However, fine-tuning requires storing a full copy of the parameter states for every task, which is memory-consuming and expensive to serve for large-scale models with billions of parameters, such as GPT-3.
Meanwhile, prompt-tuning has become an increasingly popular parameter-efficient method for steering large pretrained language models to various tasks. Most prompting techniques are applied to language understanding and assume fixed prompts for all data samples within a task. There is therefore a need to explore prompt-tuning for open-domain dialogue generation, where data samples may vary greatly within a task.
In this thesis, we present a novel, instance-specific prompt-tuning algorithm for dialogue generation. Specifically, we generate prompts based on instance-level control code, rather than the conversation context, to explore their impact on controlled dialogue generation. Experiments on popular open-domain dialogue datasets, evaluated with both automated metrics and human evaluation, demonstrate that our method is superior to prompting baselines as well as other lightweight controlled generation methods, and comparable to fine-tuning with less than 10% of total parameters.
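The general idea of generating a prompt from an instance-level control code can be illustrated with a small NumPy sketch. This is not the thesis's architecture: the sizes, the two-layer generator, and the names `code_table`, `make_prompt`, and `prepend_prompt` are all hypothetical, and no training loop is shown. It only shows that the prompt vectors depend on the control code while the dialogue context embeddings pass through unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
NUM_CODES, CODE_DIM = 4, 8    # number of control codes, code embedding size
PROMPT_LEN, D_MODEL = 5, 16   # prompt length, hidden size of the frozen model

# Parameters of the prompt generator; in prompt-tuning these would be the
# only trainable weights, while the language model itself stays frozen.
code_table = rng.normal(size=(NUM_CODES, CODE_DIM))
w1 = rng.normal(size=(CODE_DIM, 32)) * 0.1
w2 = rng.normal(size=(32, PROMPT_LEN * D_MODEL)) * 0.1


def make_prompt(code_id):
    """Map one control code to PROMPT_LEN continuous prompt vectors."""
    hidden = np.tanh(code_table[code_id] @ w1)
    return (hidden @ w2).reshape(PROMPT_LEN, D_MODEL)


def prepend_prompt(code_id, token_embeddings):
    """Prepend the code-specific prompt to the context token embeddings."""
    return np.concatenate([make_prompt(code_id), token_embeddings], axis=0)


# The same dialogue context paired with two different control codes.
context = rng.normal(size=(7, D_MODEL))
x0 = prepend_prompt(0, context)
x1 = prepend_prompt(1, context)
```

Here `x0` and `x1` share the same context rows but carry different prompt rows, which is what distinguishes an instance-specific prompt from a single prompt fixed for the whole task.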
Attribute Controlled Dialogue Prompting
Prompt-tuning has become an increasingly popular parameter-efficient method
for adapting large pretrained language models to downstream tasks. However,
both discrete prompting and continuous prompting assume fixed prompts for all
data samples within a task, neglecting the fact that inputs vary greatly in
some tasks such as open-domain dialogue generation. In this paper, we present a
novel, instance-specific prompt-tuning algorithm for dialogue generation.
Specifically, we generate prompts based on instance-level control code, rather
than the conversation history, to explore their impact on controlled dialogue
generation. Experiments on popular open-domain dialogue datasets, evaluated on
both automated metrics and human evaluation, demonstrate that our method is
superior to prompting baselines and comparable to fine-tuning with only 5%-6%
of total parameters.
Comment: Accepted at ACL 2023 in Findings.
PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding
We are now witnessing significant progress of deep learning methods in a
variety of tasks (or datasets) of proteins. However, there is a lack of a
standard benchmark to evaluate the performance of different methods, which
hinders the progress of deep learning in this field. In this paper, we propose
such a benchmark called PEER, a comprehensive and multi-task benchmark for
Protein sEquence undERstanding. PEER provides a set of diverse protein
understanding tasks including protein function prediction, protein localization
prediction, protein structure prediction, protein-protein interaction
prediction, and protein-ligand interaction prediction. We evaluate different
types of sequence-based methods for each task including traditional feature
engineering approaches, different sequence encoding methods as well as
large-scale pre-trained protein language models. In addition, we also
investigate the performance of these methods under the multi-task learning
setting. Experimental results show that large-scale pre-trained protein
language models achieve the best performance for most individual tasks, and
jointly training multiple tasks further boosts the performance. The datasets
and source codes of this benchmark are all available at
https://github.com/DeepGraphLearning/PEER_Benchmark
Comment: Accepted by NeurIPS 2022 Dataset and Benchmark Track. arXiv v2:
source code released; arXiv v1: release all benchmark results
A Case of Paradoxical Embolism Causing Anterior Spinal Cord Syndrome and Acute Myocardial Infarction Following the Intradiscal Oxygen-Ozone Therapy
We report the case of a 66-year-old woman who suddenly developed flaccid paralysis of the lower extremities, accompanied by loss of pain and temperature sensation below the T4 level, during an oxygen–ozone injection for disc herniation. Half an hour later, she developed chest pain. Magnetic resonance imaging (MRI) showed long-segment hyperintensity in the thoracic spinal cord from the T2 to T10 levels on sagittal T2-weighted images (T2WI). The electrocardiogram (ECG) showed ST-segment elevation in leads V1–V6. She was diagnosed with spinal cord infarction and ST-elevation myocardial infarction (STEMI). Transthoracic echocardiography with saline contrast revealed a large patent foramen ovale (PFO), consistent with the detection of massive microbubbles in the left atrium. We discuss the potential role of paradoxical embolism in spinal cord infarction and myocardial infarction.
A Deep Learning Application for Deformation Prediction from Ground-Based InSAR
Ground-based synthetic aperture radar interferometry (GB-InSAR) offers high precision, high temporal resolution, and high spatial resolution, and is widely used in highwall deformation monitoring. Traditional GB-InSAR real-time processing handles the whole data set, or groups of it, in time sequence. This approach takes up a lot of computer memory, is inefficient, cannot meet the timeliness requirements of slope monitoring, and cannot perform deformation prediction or disaster warning. In response to this problem, this paper proposes a GB-InSAR time series processing method based on the long short-term memory (LSTM) model. First, from the early monitoring data of the GBSAR equipment, a time series InSAR method (PS-InSAR, SBAS, etc.) is used to obtain the initial deformation information. Then, from the deformation calculated in the previous stage and the monitored atmospheric environmental parameters, the LSTM model predicts the deformation and atmospheric delay at the next time step. The predicted phase is removed from the interferometric phase, and the residual phase is then unwrapped with a spatial-domain unwrapping algorithm to solve for the residual deformation. The predicted deformation and the residual deformation are added to obtain the deformation at the current moment. This method only needs to process the difference map at the current moment, which greatly reduces time series processing time and makes deformation prediction possible. The reliability of the proposed method is verified with ground-based SAR monitoring data from the Guangyuan landslide in Sichuan Province.
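The update step described in this abstract (remove the predicted phase, unwrap the residual, add predicted and residual deformation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the wavelength, the phase-to-displacement sign convention, and the function names are hypothetical, the LSTM output is replaced by hand-made arrays, and one-dimensional `np.unwrap` stands in for a spatial-domain unwrapping algorithm.

```python
import numpy as np

WAVELENGTH_M = 0.0174  # assumed Ku-band radar wavelength; not given in the source


def wrap(phase):
    """Wrap phase values into the interval [-pi, pi)."""
    return (phase + np.pi) % (2 * np.pi) - np.pi


def los_to_phase(displacement, wavelength=WAVELENGTH_M):
    """Two-way phase for a line-of-sight displacement (sign convention assumed)."""
    return 4 * np.pi / wavelength * displacement


def phase_to_los(phase, wavelength=WAVELENGTH_M):
    """Inverse of los_to_phase."""
    return wavelength / (4 * np.pi) * phase


def update_deformation(obs_wrapped, pred_deformation, pred_atmo_phase):
    """Combine a predicted deformation with the unwrapped residual phase."""
    pred_phase = los_to_phase(pred_deformation) + pred_atmo_phase
    residual_wrapped = wrap(obs_wrapped - pred_phase)
    residual_unwrapped = np.unwrap(residual_wrapped)  # 1-D stand-in for spatial unwrapping
    return pred_deformation + phase_to_los(residual_unwrapped)


# Synthetic profile: up to 20 mm of true deformation plus a constant atmospheric phase.
d_true = np.linspace(0.0, 0.02, 50)
atmo_phase = 0.3
obs = wrap(los_to_phase(d_true) + atmo_phase)

# Stand-in for the LSTM output: a prediction that is 5% too small.
d_pred = 0.95 * d_true
d_now = update_deformation(obs, d_pred, atmo_phase)
```

Because the prediction removes most of the signal, the residual phase stays well inside one cycle and unwraps trivially, which is why only the current interferogram has to be processed. In this synthetic case `d_now` recovers `d_true` to floating-point accuracy.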
Microdeletion in distal PLP1 enhancers causes hereditary spastic paraplegia 2
Objectives: Hereditary spastic paraplegia (HSP) is a genetically heterogeneous disease associated with over 70 genes, and a significant number of patients remain genetically unsolved. In this study, we recruited a family with suspected HSP characterized by spasticity, developmental delay, ataxia and hypomyelination, and sought to reveal its molecular etiology by whole exome sequencing (WES) and long‐read sequencing (LRS) analyses.
Methods: WES was performed on 13 individuals of the family to identify the causative mutations, including analyses of SNVs (single‐nucleotide variants) and CNVs (copy number variants). Accurate circular consensus (CCS) long‐read sequencing (LRS) was used to verify the findings of the CNV analysis from WES.
Results: SNV analysis of the WES data identified a missense variant, c.195G>T (p.E65D), of MORF4L2 at Xq22.2 co‐segregating in this family. Further CNV analysis revealed a microdeletion adjacent to the MORF4L2 gene, also co‐segregating in this family. LRS verified this microdeletion and confirmed the deletion range (chrX: 103,690,507–103,715,018, hg38) at nucleotide‐level resolution.
Interpretation: In this study, we identified an Xq22.2 microdeletion (about 24.5 kb), which contains distal enhancers of the PLP1 gene, as a likely cause of SPG2 in this family. The lack of distal enhancers may result in transcriptional repression of PLP1 in oligodendrocytes, potentially affecting its role in the maintenance of myelin and causing the SPG2 phenotype. This study highlights the importance of noncoding genomic alterations in the genetic etiology of SPG2.
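As a quick consistency check, the reported deletion size of about 24.5 kb follows from the reported coordinates. The sketch below assumes that both endpoints belong to the deleted interval (1-based, inclusive), which the abstract does not state explicitly.

```python
# Coordinates as reported in the abstract (chrX, hg38).
start, end = 103_690_507, 103_715_018

# Assumption: both endpoints are inside the deletion (1-based, inclusive).
length_bp = end - start + 1
length_kb = length_bp / 1000

print(f"{length_bp} bp = {length_kb:.1f} kb")  # prints "24512 bp = 24.5 kb"
```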
SERS Sensing Using Graphene-Covered Silver Nanoparticles and Metamaterials for the Detection of Thiram in Soil
Multilayer
hyperbolic metamaterial (HMM)-based SERS substrates
have received special consideration because they accommodate various
propagation modes such as surface plasmonic polaritons (SPP). However,
the SPP modes are difficult to generate in HMM due to their weak electric
field enhancement. In this article, we designed novel SERS substrates
consisting of graphene-covered AgNPs and HMM. The graphene-covered
AgNPs work as an external coupling structure for hyperbolic metamaterials
due to this structure exhibiting significant plasmonic effects as
well as unique optical features. The localized surface plasmonic resonance
(LSPR) of the graphene-covered AgNPs excited the SPP and thus formed
a strong hot spot zone in the nanogap area of the graphene. The Raman
experiment was performed using rhodamine 6G (R6G) and crystal violet
(CV), which showed high stability and a maximum enhancement factor
of 2.12 × 10⁸. The COMSOL simulation further clarified
that the enhanced SERS performance was due to the presence of monolayer
graphene, which provided an atomically flat surface for organic molecules
in a more controllable manner. Notably, the proposed SERS structure
enables quantitative detection of thiram in soil and can satisfy
the basic environmental need for detecting pesticide residue in soil.
MOESM1 of Isolation of nontuberculous mycobacteria from soil using Middlebrook 7H10 agar with increased malachite green concentration
Additional file 1. Additional material